Previously: v4.12.
Here's a short summary of some of the interesting security things in Sunday's v4.13 release of the Linux kernel:
security documentation ReSTification
The kernel has been switching to formatting documentation with ReST, and I noticed that none of the Documentation/security/ tree had been converted yet. I took the opportunity to take a few passes at formatting the existing documentation and, at Jon Corbet's recommendation, split it up between end-user documentation (which is mainly how to use LSMs) and developer documentation (which is mainly how to use various internal APIs). A bunch of these docs need some updating, so maybe with the improved visibility, they'll get some extra attention.
CONFIG_REFCOUNT_FULL
Since Peter Zijlstra implemented the refcount_t API in v4.11, Elena Reshetova (with Hans Liljestrand and David Windsor) has been systematically replacing atomic_t reference counters with refcount_t. As of v4.13, there are now close to 125 conversions with many more to come. However, there were concerns over the performance characteristics of the refcount_t implementation from the maintainers of the net, mm, and block subsystems. In order to assuage these concerns and help the conversion progress continue, I added an unchecked refcount_t implementation (identical to the earlier atomic_t implementation) as the default, with the fully checked implementation now available under CONFIG_REFCOUNT_FULL. The plan is that for v4.14 and beyond, the kernel can grow per-architecture implementations of refcount_t that have performance characteristics on par with atomic_t (as done in grsecurity's PAX_REFCOUNT).
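To illustrate what "checked" buys over a plain atomic counter, here is a minimal userspace sketch using C11 atomics. It is not the kernel's refcount_t code: the names checked_ref, checked_ref_inc, and checked_ref_dec_and_test are hypothetical, and the policy shown (refuse to increment from zero, saturate at UINT_MAX instead of wrapping) is only an approximation of the general idea.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a checked reference counter. */
struct checked_ref {
    atomic_uint count;
};

void checked_ref_init(struct checked_ref *r, unsigned int n)
{
    atomic_init(&r->count, n);
}

/* Take a reference. Fails if the count is already 0 (object released). */
bool checked_ref_inc(struct checked_ref *r)
{
    unsigned int old = atomic_load(&r->count);

    do {
        if (old == 0)
            return false;   /* would resurrect a freed object */
        if (old == UINT_MAX)
            return true;    /* saturated: stay pinned, never wrap to 0 */
    } while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));

    return true;
}

/* Drop a reference. Returns true only when the count reaches zero. */
bool checked_ref_dec_and_test(struct checked_ref *r)
{
    unsigned int old = atomic_load(&r->count);

    do {
        if (old == 0 || old == UINT_MAX)
            return false;   /* underflow or saturated: leave untouched */
    } while (!atomic_compare_exchange_weak(&r->count, &old, old - 1));

    return old == 1;
}
```

The compare-and-exchange loops are where the extra cost comes from: an unchecked counter can use a single atomic add, which is the kind of overhead the performance concerns above were about.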
CONFIG_FORTIFY_SOURCE
Daniel Micay created a version of glibc's FORTIFY_SOURCE compile-time and run-time protection for finding overflows in the common string (e.g. strcpy, strcmp) and memory (e.g. memcpy, memcmp) functions. The idea is that since the compiler already knows the size of many of the buffer arguments used by these functions, it can already build in checks for buffer overflows. When all the sizes are known at compile time, this can actually allow the compiler to fail the build instead of continuing with a proven overflow. When only some of the sizes are known (e.g. destination size is known at compile-time, but source size is only known at run-time) run-time checks are added to catch any cases where an overflow might happen. Adding this found several places where minor leaks were happening, and Daniel and I chased down fixes for them.
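The run-time half of this idea can be sketched in userspace roughly as follows. This is an illustration under assumptions, not the kernel's implementation: checked_memcpy and checked_memcpy_sized are hypothetical names, and the sketch returns an error code where a real fortified function would abort or fail the build.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical checked copy. dst_size is whatever the compiler could
 * prove about the destination; (size_t)-1 means "size unknown".
 * Returns 0 on success, -1 if the copy would overflow dst.
 */
int checked_memcpy_sized(void *dst, const void *src, size_t len,
                         size_t dst_size)
{
    if (dst_size != (size_t)-1 && len > dst_size)
        return -1;          /* overflow caught at run time */
    memcpy(dst, src, len);
    return 0;
}

/* Let the compiler fill in the destination size at each call site. */
#define checked_memcpy(dst, src, len) \
    checked_memcpy_sized((dst), (src), (len), \
                         __builtin_object_size((dst), 0))
```

Calling checked_memcpy(buf, src, n) with a fixed-size local buf lets the compiler substitute the array size for the last argument, so only the length comparison remains at run time. When the size cannot be determined, the builtin yields (size_t)-1 and the check is skipped.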
One interesting note about this protection is that it only examines the size of the whole object (via __builtin_object_size(..., 0)). If you have a string within a structure, CONFIG_FORTIFY_SOURCE as currently implemented will only make sure that you can't copy beyond the structure (and therefore, you can still overflow the string within the structure). The next step in enhancing this protection is to switch from 0 (above) to 1, which will use the closest surrounding subobject (e.g. the string). However, there are a lot of cases where the kernel intentionally copies across multiple structure fields, which means more fixes before this higher level can be enabled.
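The difference between the two modes can be seen with a small sketch. The structure and function names here are hypothetical; the only real interface used is the compiler's __builtin_object_size, as provided by GCC and Clang.

```c
#include <stddef.h>

/* Hypothetical structure: a string followed by another field. */
struct record {
    char name[16];
    int  flags;
};

/* Mode 0: bytes from name to the end of the whole enclosing object. */
size_t name_room_mode0(void)
{
    struct record rec;

    return __builtin_object_size(rec.name, 0);
}

/* Mode 1: bytes left in the closest surrounding subobject (the array). */
size_t name_room_mode1(void)
{
    struct record rec;

    return __builtin_object_size(rec.name, 1);
}
```

With mode 0 the limit for a copy into name is the size of the whole structure, so a write that runs past name into flags is not detected. With mode 1 the limit is the 16 bytes of the array itself.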
NULL-prefixed stack canary
Rik van Riel and Daniel Micay changed how the stack canary is defined on 64-bit systems to always make sure that the leading byte is zero. This provides a deterministic defense against overflowing string functions (e.g. strcpy), since they will either stop an overflowing read at the NULL byte, or be unable to write a NULL byte, thereby always triggering the canary check. This does reduce the entropy from 64 bits to 56 bits for overflow cases where NULL bytes can be written (e.g. memcpy), but the trade-off is worth it. (Besides, x86_64's canary was 32 bits until recently.)
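A rough userspace sketch of the effect, with hypothetical helper names: the first function zeroes the first byte of a 64-bit value in memory order (on a machine of known endianness a single mask would do the same), and the second measures how far a C string function would read into the canary.

```c
#include <stdint.h>
#include <string.h>

/* Force the leading (first-in-memory) byte of a 64-bit canary to zero. */
uint64_t nul_prefix_canary(uint64_t raw)
{
    unsigned char bytes[sizeof(raw)];

    memcpy(bytes, &raw, sizeof(raw));
    bytes[0] = 0;                   /* leaves 56 random bits */
    memcpy(&raw, bytes, sizeof(raw));
    return raw;
}

/* How many canary bytes a string function would read before stopping. */
size_t canary_string_length(uint64_t canary)
{
    char bytes[sizeof(canary) + 1];

    memcpy(bytes, &canary, sizeof(canary));
    bytes[sizeof(canary)] = '\0';   /* guard so strlen always terminates */
    return strlen(bytes);
}
```

For a masked canary the measured length is zero, so a string-based leak stops before revealing any canary bytes, and a string-based overwrite cannot reproduce the zero byte without terminating itself.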
IPC refactoring
Partially in support of allowing IPC structure layouts to be randomized by the randstruct plugin, Manfred Spraul and I reorganized the internal layout of how IPC is tracked in the kernel. The resulting allocations are smaller and much easier to deal with, even if I initially missed a few needed container_of() uses.
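For readers unfamiliar with the idiom, here is a minimal userspace sketch of why container_of() matters once layouts can move. The structures are hypothetical and are not the kernel's IPC types; the macro is a simplified version of the usual idiom without the kernel's type checking.

```c
#include <stddef.h>

/* Simplified userspace version of the container_of() idiom. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical common header shared by several object types. */
struct ipc_header {
    int id;
};

/* Hypothetical object embedding the header away from offset 0. */
struct msg_object {
    long bytes_queued;
    struct ipc_header hdr;
};

struct msg_object *msg_from_header(struct ipc_header *hdr)
{
    /* A plain cast would only be right if hdr were the first member. */
    return container_of(hdr, struct msg_object, hdr);
}
```

A direct cast from the header pointer to the outer type silently assumes the header sits at offset zero, which is exactly the kind of assumption layout randomization breaks.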
randstruct gcc plugin
I ported grsecurity's clever randstruct gcc plugin to upstream. This plugin allows structure layouts to be randomized on a per-build basis, providing a probabilistic defense against attacks that need to know the location of sensitive structure fields in kernel memory (which is most attacks). By moving things around in this fashion, attackers need to perform much more work to determine the resulting layout before they can mount a reliable attack.
Unfortunately, due to the timing of the development cycle, only the manual mode of randstruct landed in upstream (i.e. marking structures with __randomize_layout). v4.14 will also have the automatic mode enabled, which randomizes all structures that contain only function pointers.
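As a sketch of what manual marking looks like, the following compiles in userspace by defining __randomize_layout as a no-op stand-in (an assumption made here so the example builds without the plugin). The structure and functions are hypothetical.

```c
/* Stand-in so this builds without the plugin: the marking does nothing. */
#ifndef __randomize_layout
#define __randomize_layout
#endif

/* Hypothetical operations table, opted in to layout randomization. */
struct demo_ops {
    int (*open)(int id);
    int (*close)(int id);
} __randomize_layout;

static int demo_open(int id)  { return id + 1; }
static int demo_close(int id) { return id - 1; }

/* Designated initializers stay correct whatever order the fields land in. */
const struct demo_ops demo = {
    .open  = demo_open,
    .close = demo_close,
};
```

The practical consequence for code touching a randomized structure is that nothing may depend on field order: fields are initialized and accessed by name, never by position or by casting to a structure with a similar shape.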
A large number of fixes to support randstruct have been landing from v4.10 through v4.13, most of which were already identified and fixed by grsecurity, but many were novel: fixes in newly added drivers, whitelisted cross-structure casts, refactorings (like the IPC work noted above), and a corner case on ARM found during upstream testing.
lower ELF_ET_DYN_BASE
One of the issues identified from the Stack Clash set of vulnerabilities was that it was possible to collide stack memory with the highest portion of a PIE program's text memory since the default ELF_ET_DYN_BASE (the lowest possible random position of a PIE executable in memory) was already so high in the memory layout (specifically, 2/3rds of the way through the address space). Fixing this required teaching the ELF loader how to load interpreters as shared objects in the mmap region instead of as a PIE executable (to avoid potentially colliding with the binary it was loading). As a result, the PIE default could be moved down to ET_EXEC (0x400000) on 32-bit, entirely avoiding this subset of Stack Clash attacks. 64-bit could be moved to just above the 32-bit address space (0x100000000), leaving the entire 32-bit region open for VMs to do 32-bit addressing, but late in the cycle it was discovered that Address Sanitizer couldn't handle it moving. With most of the Stack Clash risk only applicable to 32-bit, fixing 64-bit has been deferred until there is a way to teach Address Sanitizer how to load itself as a shared object instead of as a PIE binary.
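The arithmetic behind the change can be sketched as below. The function and macro names are hypothetical, and the 0xC0000000 address-space size used in the usage note is an assumption (a common 3G/1G split on 32-bit), not a value taken from the text above.

```c
#include <stdint.h>

/* Old default: two thirds of the way through the address space. */
uint64_t old_pie_base(uint64_t task_size)
{
    return task_size / 3 * 2;
}

/* New 32-bit default described above: the traditional ET_EXEC base. */
#define NEW_PIE_BASE_32      UINT64_C(0x400000)

/* 64-bit value described above as deferred: just past the 32-bit space. */
#define PROPOSED_PIE_BASE_64 UINT64_C(0x100000000)

/* Room between the PIE base and the top of the address space. */
uint64_t gap_below_top(uint64_t task_size, uint64_t pie_base)
{
    return task_size - pie_base;
}
```

Under the assumed 0xC0000000 address space, old_pie_base() gives 0x80000000, leaving only 0x40000000 bytes between the lowest PIE position and the top of memory where the stack lives. Moving the base to 0x400000 widens that gap to nearly the whole address space.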
early device randomness
I noticed that early device randomness wasn't actually getting added to the kernel entropy pools, so I fixed that to improve the effectiveness of the latent_entropy gcc plugin.
That's it for now; please let me know if I missed anything. As a side note, I was rather alarmed to discover that due to all my trivial ReSTification formatting, and tiny FORTIFY_SOURCE and randstruct fixes, I made it into the most active 4.13 developers list (by patch count) at LWN with 76 patches: a whopping 0.6% of the cycle's patches. ;)
Anyway, the v4.14 merge window is open!
© 2017, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.